ENT-14108: cf-execd.service: drain cf-agent on stop #6146
@cf-bottom Jenkins please :)
Alright, I triggered a build:
Jenkins: https://ci.cfengine.com/job/pr-pipeline/13857/
Packages: http://buildcache.cfengine.com/packages/testing-pr/jenkins-pr-pipeline-13857/
nickanderson left a comment:
Not sure about the 60s wait. Is it not possible for cf-agent to start cf-execd inside those 60s?
Not sure I follow, @nickanderson. Is it not cf-execd that starts cf-agent, and not the other way around? With this fix, when you stop cf-execd, it now waits for cf-agent to finish. If the agent does not finish within 60 seconds, it gets killed. This is to fix the issue where a lingering agent can start pulling in dependencies again after
Both things can be true. There is policy in the MPF that watches over CFEngine's own processes, but I think this is mostly skipped in the case of systemd. But for example: And there are some promises that target systemd, but notice that cf-execd is commented out because of FUD. I guess I am wondering what waiting for an arbitrary time is really gaining us. If I systemctl stop cf-execd, what is the real difference between waiting 2 seconds or 60 seconds? Neither is based on the actual system state or on how long we expect an agent process to take.
So what you're saying, @nickanderson, is: why not just kill the agent right away, i.e., instead of waiting for it to finish?
Maybe I am, I dunno. I am probably just overthinking it. Why not give it at least 60s to finish up? That's why I went ahead and approved it. It just seemed arbitrary, and I was looking for meaning.
`KillMode=process` only signals cf-execd. Any cf-agent spawned by cf-execd keeps running after systemctl stop returns. A mid-run agent can then re-trigger cf-php-fpm (`Wants=cf-postgres`), causing dependencies to be pulled back in after the stop was reported successful.

This fix adds `ExecStopPost=` that waits up to 60s for cf-agent to drain, then `SIGKILL`s any survivor. It runs after cf-execd has exited, so no new agents are spawned during the drain.

Ticket: ENT-14108
Changelog: cf-execd systemctl stop now waits for in-flight cf-agent to finish
Signed-off-by: Lars Erik Wik <lars.erik.wik@northern.tech>
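
For readers who want to picture the drain step described above, here is a minimal shell sketch of "wait up to N seconds, then `SIGKILL` survivors". It is not the script from this PR: the function name `drain_by_name` and its return codes are hypothetical, and it assumes `pgrep` and `pkill` (procps) are available on the host.

```shell
#!/bin/sh
# Hypothetical helper, NOT the code from the PR: an illustration only.
# Wait up to $2 seconds for every process named $1 to exit,
# then send SIGKILL to whatever is still running.
drain_by_name() {
    name=$1
    timeout=$2
    waited=0
    # pgrep -x matches the exact process name; it exits 0 while a match exists.
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            # Grace period is over: force-kill any survivor.
            pkill -KILL -x "$name"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # Nothing left to wait for: the drain finished within the grace period.
    return 0
}

# Example call (commented out so that sourcing this file has no side effects):
# drain_by_name cf-agent 60
```

In this sketch the return code only distinguishes a clean drain (0) from a forced kill (1). The loop polls once per second, so the 60s figure discussed in the thread acts as an upper bound: the function returns as soon as no matching process remains.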
3a97435 to cd78895
@cf-bottom Jenkins please :)
Alright, I triggered a build:
Jenkins: https://ci.cfengine.com/job/pr-pipeline/13879/
Packages: http://buildcache.cfengine.com/packages/testing-pr/jenkins-pr-pipeline-13879/